This report explores a dataset containing campaign contributions to all candidates in the 2016 presidential election by Michigan residents. As a former Michigander, I’m particularly interested in Michigan’s current political climate.
## 'data.frame': 163765 obs. of 18 variables:
## $ cmte_id : chr "C00580100" "C00580100" "C00580100" "C00580100" ...
## $ cand_id : chr "P80001571" "P80001571" "P80001571" "P80001571" ...
## $ cand_nm : chr "Trump, Donald J." "Trump, Donald J." "Trump, Donald J." "Trump, Donald J." ...
## $ contbr_nm : chr "RODDEN, JULIE" "SELLERS, KENNETH" "SELLERS, KENNETH" "SELLERS, KENNETH" ...
## $ contbr_city : chr "TAYLOR" "BROWNSTOWN" "BROWNSTOWN" "BROWNSTOWN" ...
## $ contbr_st : chr "MI" "MI" "MI" "MI" ...
## $ contbr_zip : int 48180 48134 48134 48134 48134 493029352 48067 48382 490096510 498539211 ...
## $ contbr_employer : chr "INFORMATION REQUESTED" "DOD" "DOD" "DOD" ...
## $ contbr_occupation: chr "INFORMATION REQUESTED" "QA SPECILAIST" "QA SPECILAIST" "QA SPECILAIST" ...
## $ contb_receipt_amt: num 76.4 -20 -20 -20 -20 ...
## $ contb_receipt_dt : chr "09-NOV-16" "06-OCT-16" "13-OCT-16" "20-OCT-16" ...
## $ receipt_desc : chr "" "" "" "" ...
## $ memo_cd : chr "X" "X" "X" "X" ...
## $ memo_text : chr "" "" "" "" ...
## $ form_tp : chr "SA18" "SA18" "SA18" "SA18" ...
## $ file_num : int 1146165 1146165 1146165 1146165 1146165 1091718 1146165 1146165 1077404 1091718 ...
## $ tran_id : chr "SA18.133973" "SA18.188012" "SA18.188013" "SA18.188014" ...
## $ election_tp : chr "G2016" "G2016" "G2016" "G2016" ...
## [1] 24
Our dataset consists of 18 variables, with 163,765 observations and 24 individual candidates that residents contributed to.
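The counts above can be reproduced with a few base R calls. The following is only a minimal sketch: the toy data frame `contributions` and the names `n_obs`, `n_vars` and `n_candidates` are assumptions made for illustration, since the original analysis code is not shown in this report.

```r
# Toy stand-in for the FEC contributions table (values are illustrative only).
contributions <- data.frame(
  cand_nm = c("Trump, Donald J.", "Clinton, Hillary Rodham",
              "Sanders, Bernard", "Clinton, Hillary Rodham"),
  contb_receipt_amt = c(76.4, 25, 27, 100),
  stringsAsFactors = FALSE
)

n_obs        <- nrow(contributions)                    # number of contribution records
n_vars       <- ncol(contributions)                    # number of variables
n_candidates <- length(unique(contributions$cand_nm))  # distinct candidates

str(contributions)  # prints the same kind of structure overview as above
```

On the real data, these three values would correspond to the 163,765 observations, 18 variables and 24 candidates reported above.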
## # A tibble: 24 × 5
## cand_nm total avg median max
## <chr> <int> <dbl> <dbl> <dbl>
## 1 Clinton, Hillary Rodham 68774 94.91927 25.00 5400
## 2 Sanders, Bernard 47889 41.69198 27.00 2700
## 3 Trump, Donald J. 20934 140.83903 55.13 5000
## 4 Cruz, Rafael Edward 'Ted' 11306 81.30434 50.00 10800
## 5 Carson, Benjamin S. 9075 106.11952 50.00 5400
## 6 Rubio, Marco 1726 420.28802 100.00 5400
## 7 Paul, Rand 1012 197.83497 50.00 2700
## 8 Kasich, John R. 779 539.46098 200.00 2700
## 9 Bush, Jeb 768 1258.82096 1000.00 5400
## 10 Fiorina, Carly 551 188.77677 100.00 2700
## # ... with 14 more rows
The top Democratic candidates (Hillary Clinton, Bernie Sanders) received the largest number of donations, while the top three Republican candidates (Donald Trump, Ted Cruz, Ben Carson) received fewer overall donations. The median donation per Republican candidate was roughly double that of the Democrats.
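A per-candidate summary like the one above (count, mean, median and maximum) can be sketched in base R as follows. The tibble output suggests the original used dplyr, but that code is not shown, so this version is an assumption that sticks to base R, and the toy amounts are invented.

```r
# Toy data: candidate names and amounts are placeholders, not real figures.
contributions <- data.frame(
  cand_nm = c("A", "A", "A", "B", "B"),
  contb_receipt_amt = c(25, 50, 2700, 10, 30),
  stringsAsFactors = FALSE
)

# Split the amounts by candidate and summarise each group in one row.
by_candidate <- do.call(rbind, lapply(
  split(contributions$contb_receipt_amt, contributions$cand_nm),
  function(amt) data.frame(total = length(amt), avg = mean(amt),
                           median = median(amt), max = max(amt))
))
by_candidate$cand_nm <- rownames(by_candidate)

# Sort so the candidates with the most contributions come first.
by_candidate <- by_candidate[order(-by_candidate$total), ]
print(by_candidate)
```

Comparing `avg` with `median` in such a table is what exposes the skew: a handful of large donations pulls the mean far above the typical contribution.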
## # A tibble: 1,238 × 5
## contbr_city total avg median max
## <chr> <int> <dbl> <dbl> <dbl>
## 1 ANN ARBOR 13962 102.66451 30.00 2700
## 2 GRAND RAPIDS 6616 112.36365 27.00 5400
## 3 DETROIT 4260 115.50208 27.00 5400
## 4 KALAMAZOO 3355 79.05810 33.55 2700
## 5 LANSING 3137 61.98019 25.00 2700
## 6 ROYAL OAK 2861 79.54320 25.00 2700
## 7 EAST LANSING 2631 88.04159 30.00 5400
## 8 BLOOMFIELD HILLS 2615 320.46039 80.00 5400
## 9 FARMINGTON HILLS 2289 131.50433 25.00 10800
## 10 YPSILANTI 2286 64.85254 27.00 2700
## # ... with 1,228 more rows
The five cities with the largest number of donations were Ann Arbor, Grand Rapids, Detroit, Kalamazoo and Lansing. Some cities, such as Bloomfield Hills and Grosse Pointe, had much larger median donations, which aligns with my understanding of the wealth in these communities.
The vast majority of donations were less than $500, which suggests that, at least among individual contributions, wealthy donors do not account for most donations by count.
For donations less than $100, there are peaks around $25, $50 and $75, which align with suggested donation amounts incremented by $25.
According to the Federal Election Commission, individual contribution limits are:
$2,700 per election to a Federal candidate or the candidate’s campaign committee. Notice that the limit applies separately to each election. Primaries, runoffs and general elections are considered separate elections.
Since presidential campaigns have both primaries and general elections, you can see this reflected in the clustering of contribution counts at $2,700 and $5,400.
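One way to check this clustering numerically, rather than reading it off a histogram, is to count the contributions that sit exactly at one or two times the per-election limit. This is a hedged sketch with invented amounts; `per_election_limit`, `n_single`, `n_double` and `share_small` are hypothetical names.

```r
# Invented contribution amounts, for illustration only.
amounts <- c(25, 50, 2700, 2700, 5400, 100, 2700, 75)

per_election_limit <- 2700  # FEC limit per election quoted above

n_single    <- sum(amounts == per_election_limit)      # maxed out for one election
n_double    <- sum(amounts == 2 * per_election_limit)  # maxed out for primary + general
share_small <- mean(amounts < 500)                     # fraction of donations under $500

cat(n_single, "at $2,700;", n_double, "at $5,400;",
    round(100 * share_small), "% under $500\n")
```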
The scale comparison above, of the candidates with the most and least contributions, shows a vast difference in totals: the top candidates received tens of thousands of contributions (Clinton alone received 68,774), while four of the five smallest candidates have fewer than 10 donations.
## # A tibble: 10,583 × 2
## contbr_employer total
## <chr> <int>
## 1 RETIRED 25141
## 2 N/A 24051
## 3 NONE 11788
## 4 SELF-EMPLOYED 8505
## 5 INFORMATION REQUESTED 6644
## 6 NOT EMPLOYED 6276
## 7 UNIVERSITY OF MICHIGAN 3824
## 8 SELF EMPLOYED 3504
## 9 SELF 2555
## 10 MICHIGAN STATE UNIVERSITY 1604
## # ... with 10,573 more rows
## # A tibble: 5,748 × 2
## contbr_occupation total
## <chr> <int>
## 1 RETIRED 43876
## 2 NOT EMPLOYED 15306
## 3 INFORMATION REQUESTED 6675
## 4 PROFESSOR 3540
## 5 ATTORNEY 3032
## 6 TEACHER 2838
## 7 HOMEMAKER 2643
## 8 PHYSICIAN 2630
## 9 ENGINEER 2592
## 10 SALES 1370
## # ... with 5,738 more rows
Employment Observations
From the plot above, you can see a gradual climb in donations during 2015 and a steep climb from January to March 2016. There are three contribution peaks in 2016 (in March, July and October), each close to an election event: the Michigan Primary on March 8th, the Democratic and Republican national conventions in late July, and the general election on November 8th.
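A timeline like this depends on turning the `contb_receipt_dt` strings (for example "09-NOV-16") into real dates. The sketch below is an assumption about how that could be done, not the original code. It looks the month up in base R's `month.abb` constant so the result does not depend on the session locale, and it assumes every two-digit year belongs to the 2000s.

```r
# Sample strings in the same DD-MON-YY layout as contb_receipt_dt.
receipt_dt <- c("09-NOV-16", "06-OCT-16", "13-OCT-16", "08-MAR-16", "15-JUL-15")

day   <- as.integer(substr(receipt_dt, 1, 2))
month <- match(substr(receipt_dt, 4, 6), toupper(month.abb))  # "NOV" -> 11
year  <- 2000L + as.integer(substr(receipt_dt, 8, 9))         # assumes 20xx

dates <- as.Date(sprintf("%04d-%02d-%02d", year, month, day))

# Count contributions per calendar month to locate the peaks.
monthly_counts <- table(format(dates, "%Y-%m"))
print(monthly_counts)
```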
To me, the most interesting features in this dataset are ‘cand_nm’, ‘contbr_nm’, ‘contbr_zip’, ‘contb_receipt_amt’ and ‘contb_receipt_dt.’ I can use ‘contb_receipt_amt’ and ‘contb_receipt_dt’ for each contribution to investigate temporal and candidate popularity trends. With ‘contbr_nm’ and ‘contbr_zip’, I can infer information about the contributor such as geographic location and gender.
A few takeaways from my analysis so far are that Democratic candidates received more contributions than Republicans and that a majority of contributions are small (less than $500). I also noticed that contribution amounts from larger donors cluster around the legal contribution limits and that contributions peak around key election events (like primaries, conventions or scandals).
In order to determine each contributor’s gender, I used the ‘gender’ package in R, which predicts gender from first names and dates of birth using historical datasets. I decided to use a threshold of 75% certainty to assign a gender to a contributor.
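The thresholding step can be sketched without calling the package itself. The data frame below is hypothetical: the `proportion_male` column imitates the kind of name-level probability such a tool reports, and the values are invented. Whether the original analysis used `>=` or `>` at the 75% boundary is not stated, so that choice is an assumption.

```r
# Hypothetical name-level predictions; the proportions are invented.
predictions <- data.frame(
  first_name      = c("JULIE", "KENNETH", "CASEY", "JORDAN"),
  proportion_male = c(0.01, 0.99, 0.58, 0.74),
  stringsAsFactors = FALSE
)

threshold <- 0.75  # minimum certainty required to assign a gender

# Assign a gender only when one side reaches the threshold; otherwise leave NA.
predictions$gender <- ifelse(
  predictions$proportion_male >= threshold, "male",
  ifelse(1 - predictions$proportion_male >= threshold, "female", NA_character_)
)
print(predictions)
```

Names that stay `NA` are effectively a third, unknown group, which is worth keeping in mind when reading gender comparisons later on.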
Both men and women in Michigan donate more to Democrats than Republicans; however, women are significantly more likely than men to donate to Democrats.
While Republicans actually donated more prior to 2016, the Democratic candidates received the overall majority of contributions. There were very few donations that were Libertarian or not marked for a particular party.
The above plot shows that Bernie Sanders actually had more support (based on number of contributions) from 2015 to well after the Democratic Primary. It was not until June 2016 that Hillary surpassed Bernie in number of donations, and not until after the DNC that Bernie support fully dried up. This coincides with my understanding of Bernie as a grassroots candidate with widespread and loyal supporters.
I think this plot is really interesting because it shows that Trump had almost no financial contributions prior to the Republican Primary on March 8, 2016. The leaders before the primary were Ben Carson and Ted Cruz. Trump’s support peaked at the Republican National Convention at the end of July and then steadily declined until the month before the general election when it had a small resurgence.
In this section, I added political affiliation to candidates and gender to contributors, which allowed me to see the relationship between these features and campaign contributions. There is a clear association between female gender and Democratic Party contributions. Other discoveries include how crowded the Republican race was prior to the primaries, the lack of Trump support prior to the Republican Primary and the strong Bernie support before the Democratic National Convention.
Donation trends to Hillary vs. Trump were uniform in shape, but not in scale across the genders and time. One key difference is that women always donated more to Hillary than Trump, while men donated more to different candidates at different points in time. In the months right before the general election, men actually made more donations to Hillary!
Contribution Observations
From this boxplot, you can see that the interquartile range for contribution amounts is less than $100 across gender/political party and that Republican men give the most per contribution.
These two datasets were available on the Splitwise blog as downloads. According to the author, this data came from:
Two different Census APIs (the Decennial Census 2010 and the ACS 5-year 2007-2011), combined with the square-footage by ZCTA listings from the 2013 U.S. Gazetteer Files.
I believe that community type and economic prospects will be strong indicators of candidate preference.
Classification (People per Square Mile):
These cutoffs are based on an analysis presented on the FiveThirtyEight blog:
Our analysis showed that the single best predictor of whether someone said his or her area was urban, suburban or rural was ZIP code density. Residents of ZIP codes with more than 2,213 people per square mile typically described their area as urban. Residents of neighborhoods with 102 to 2,213 people per square mile typically called their area suburban. In ZIP codes with fewer than 102 people per square mile, residents typically said they lived in a rural area.
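The cutoffs quoted above translate into a small classification function. This is a sketch under the assumption that the boundaries are read literally from the quote (more than 2,213 is urban, 102 to 2,213 inclusive is suburban, fewer than 102 is rural); `classify_community` is a hypothetical name.

```r
# Classify a ZIP code by population density (people per square mile).
classify_community <- function(density) {
  ifelse(density > 2213, "urban",
         ifelse(density >= 102, "suburban", "rural"))
}

# Invented densities, chosen to exercise both boundaries.
density <- c(50, 102, 1500, 2213, 2214, 9000)
community <- classify_community(density)
print(data.frame(density, community))
```

Nested `ifelse()` is used here instead of `cut()` because the two boundaries are closed on different sides, which `cut()` cannot express in a single call.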
Finally, I investigate how support for Hillary vs. Trump differed by gender and community type as a function of time. In general, female contributors favored Hillary, while male contributors favored Trump. In terms of community type, city dwellers were more likely to support Hillary, while rural Michiganders supported Trump.
From the above plots, you can see that a majority of contributors live in areas with unemployment rates lower than 20% and population density less than 3,000 people per square mile. For the zip codes that fall outside of that range, although there are still contributors from both parties, there are more Democrats in these areas with higher population density and higher unemployment.
##                               Density Per Sq Mile gender.male gender.female cand_pty_aff.Democratic Party cand_pty_aff.Republican Party
## Density Per Sq Mile                             1
## gender.male                                  0.01           1
## gender.female                               -0.02       -0.95             1
## cand_pty_aff.Democratic Party                0.16       -0.17          0.16                             1
## cand_pty_aff.Republican Party               -0.16        0.17         -0.16                         -0.99                             1
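A matrix like this mixes a numeric column with categorical ones, so the categories first have to be turned into 0/1 indicator columns. The sketch below shows one way to do that in base R. The toy data and the column names `party.dem` and `party.rep` are assumptions, and the original may have built its indicators differently.

```r
# Invented rows; one contributor has no assigned gender (NA).
toy <- data.frame(
  density = c(80, 3000, 150, 2500, 1200, 4000, 900, 60),
  gender  = c("male", "female", "male", "female", NA, "male", "female", "female"),
  party   = c("Republican", "Democratic", "Republican", "Democratic",
              "Democratic", "Democratic", "Democratic", "Republican"),
  stringsAsFactors = FALSE
)

# One 0/1 indicator column per category level; %in% maps NA to 0.
indicators <- data.frame(
  density       = toy$density,
  gender.male   = as.numeric(toy$gender %in% "male"),
  gender.female = as.numeric(toy$gender %in% "female"),
  party.dem     = as.numeric(toy$party == "Democratic"),
  party.rep     = as.numeric(toy$party == "Republican")
)

corr <- round(cor(indicators), 2)  # Pearson correlation, rounded for display
print(corr)
```

Because one toy contributor has no gender, the male and female indicators correlate at less than a perfect -1, which mirrors the -0.95 in the table above. The two party indicators are exact complements in the toy data, so they correlate at exactly -1.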
Correlation Observations
The geography of median donation seems to be relatively evenly distributed across the state with the exception of Metro Detroit, which is almost uniformly less than $30. The black regions are excluded zip codes, since they have fewer than 10 donations.
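The per-zip median with the 10-donation cutoff could be computed along these lines. This is a sketch, not the original code: truncating `contbr_zip` to its first five digits is my assumption for handling the nine-digit ZIP+4 values visible in the data, and `median_by_zip` is a hypothetical helper.

```r
# Median donation per 5-digit ZIP, dropping ZIPs with too few donations.
median_by_zip <- function(df, min_donations = 10L) {
  # contbr_zip is an integer column, so as.character() never yields
  # scientific notation; keep only the first five digits (assumption).
  zip5 <- substr(as.character(df$contbr_zip), 1, 5)
  n_per_zip <- table(zip5)
  keep <- zip5 %in% names(n_per_zip)[n_per_zip >= min_donations]
  tapply(df$contb_receipt_amt[keep], zip5[keep], median)
}

# Invented rows mixing 5-digit and ZIP+4 values.
toy <- data.frame(
  contbr_zip = c(48180L, 48180L, 481801234L, 48134L, 493029352L, 49302L),
  contb_receipt_amt = c(25, 50, 100, 27, 10, 30)
)

# The cutoff is lowered to 3 here only because the toy data is tiny.
toy_medians <- median_by_zip(toy, min_donations = 3L)
print(toy_medians)
```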
On the above map, the darker zip codes code as Democratic and the lighter ones as Republican. As you can see, the Metro Detroit area and a majority of the coastal zip codes are Democratic, while the Republican patches are spread out around the state and tend to be landlocked.
The above scatterplot matrix shows each zip code as a point. This plot simply reinforces that Democrats live in more densely populated areas than Republicans, since the Democratic zip codes have more total donations than the Republican zips.
In this section, I investigated support for Hillary vs. Trump over time, the size of campaign contributions for Democrats vs. Republicans, and the relationship of population density and unemployment to party affiliation. I observed that Democratic indicators include high population density and female gender, while Republican indicators are low population density and male gender. I also observed that Republicans (particularly Republican men) give larger donations and that, although men favored Trump over Hillary overall, men actually made more contributions to Hillary in the last few months leading up to the general election.
The above plot shows that Bernie Sanders actually had more support (based on number of contributions) from 2015 to well after the Democratic Primary. It was not until June 2016 that Hillary surpassed Bernie in number of donations, and not until after the DNC that Bernie support fully dried up. This coincides with my understanding of Bernie as a grassroots candidate with widespread and loyal supporters.
On the above map, the darker zip codes code as Democratic and the lighter ones as Republican. As you can see, the Metro Detroit area and a majority of the coastal zip codes are Democratic, while the Republican patches are spread out around the state and tend to be landlocked.
For the political party calculations, I select the dominant party (based on number of contributions) for each zip code and calculate the percentage of total contributions that party receives.
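The dominant-party calculation described here can be sketched with a contingency table. The toy data and the names `zip_party`, `dominant_party` and `pct_dominant` are assumptions. Ties are broken in favor of the first party in column order, which the original may have handled differently.

```r
# Invented contributions, one row per donation.
zip_party <- data.frame(
  zip5  = c("48104", "48104", "48104", "49503", "49503", "49503", "49503"),
  party = c("Democratic", "Democratic", "Republican",
            "Republican", "Republican", "Republican", "Democratic"),
  stringsAsFactors = FALSE
)

# Rows are zip codes, columns are parties, cells are contribution counts.
counts <- table(zip_party$zip5, zip_party$party)

dominant <- data.frame(
  zip5           = rownames(counts),
  dominant_party = colnames(counts)[apply(counts, 1, which.max)],
  pct_dominant   = 100 * apply(counts, 1, max) / rowSums(counts),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(dominant)
```

In the toy data, the first zip code comes out Democratic at about 67% and the second Republican at 75%. A percentage like this is what would drive the shading intensity on a map.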
Finally, I investigate how support for Hillary vs. Trump differed by gender and community type as a function of time. In general, female contributors favored Hillary, while male contributors favored Trump. In terms of community type, city dwellers were more likely to support Hillary, while rural Michiganders supported Trump.
Rural women and urban men are particularly intriguing, since they combine demographics with opposing political leanings. They also have extremely similar contribution profiles. In general, they leaned toward supporting Hillary, but both had a moment in July (during convention season and the DNC email scandal) when they were more supportive of Trump.
contribution
candidate
contributor
zip code
I learned so much through this exploration! One discovery that came up throughout my analysis was that Michigan residents donated more (based on number of contributions) to Democratic candidates, but made larger individual contributions to Republicans. Another discovery was that Democrats tend to be urban, female and located on the coasts, while Republicans tend to be rural, male and landlocked. A final takeaway was how clearly you can see campaign events (like primaries, conventions and scandals) in the data.
After the 2016 election, I heard so much about how women voted for Trump in droves, which supposedly won the election for him. What surprised me from this analysis is that, although this may be true at the voter-turnout level, it’s not what is reflected in the individual contributor data.
Other things that surprised me concerned the campaign timeline and candidate support. Prior to this investigation, I didn’t realize what strong support Bernie enjoyed between the primaries and the DNC, or what minimal support Trump had in comparison to other Republican candidates before the Republican Primary. Another timeline surprise was that Michigan men actually made more total contributions to Hillary than Trump in the months leading up to the general election.
The main struggle that I encountered during this project was learning the idiosyncrasies of R and ggplot. I also encountered several situations where numeric columns were encoded as factor data, which led to issues where I couldn’t plot correctly because the values weren’t recognized as continuous. I solved these factor problems by simply converting to numeric values.
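The factor issue has a well-known trap in R: calling `as.numeric()` directly on a factor returns the internal level codes, not the values. The exact conversion used in this project is not shown, so the sketch below only illustrates the general pitfall and the usual way around it.

```r
# A numeric column that was accidentally read in as a factor.
amounts_factor <- factor(c("76.4", "-20", "2700", "25"))

# Pitfall: this yields the integer level codes (1..4), not the amounts.
wrong <- as.numeric(amounts_factor)

# Going through character first recovers the real values.
right <- as.numeric(as.character(amounts_factor))

print(wrong)
print(right)
```

Once the column is truly numeric, ggplot treats it as continuous and picks a continuous scale for it.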
As next steps, I would like to more closely examine how community type (rural, suburban, urban) affects contribution outcomes using race, income and voting statistics. Once these features are included, I would like to delve into individual cities and regions within Michigan to better understand campaign donation and voting tendencies.